Day 4: Building k8s (Part 2)

2021 iThome Ironman Contest

DAY 4

DevOps

Playing with open-source k8s for 30 days, part 4 of the series

13th Ironman Contest

永和ㄦ

2021-09-04 17:05:26


Yesterday we successfully created GCP instances with Terraform and had Terraform output the instance IPs automatically. Today we will use those resources to deploy Kubernetes with Ansible.

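Ansible relies on SSH to do its work. It has a built-in mechanism for switching to sudo, but to simplify deployment I had Terraform place root's SSH key on the instances at provisioning time. Because of that, access to the control center has to be managed with particular care.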

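Getting started with configuration

Ansible has three main pieces: the cfg configuration file, the inventory, and the YAML playbooks. ansible.cfg defines the environment Ansible runs in (note that an ansible.cfg in the current directory takes precedence over the default cfg). To simplify deployment, I changed the following settings: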

[defaults]
inventory = your_path/inventory
roles_path = your_path/roles
host_key_checking = False

[privilege_escalation]
become = True
become_method = sudo
become_user = root
become_ask_pass = False
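The first settings point Ansible at my inventory and roles paths and skip the known_hosts check. The remaining ones are a safety net against my own mistakes: they make Ansible use sudo without prompting for a password and become root.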

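Next, edit the inventory file. To Ansible, an inventory is a list (several inventories can exist at the same time) that tells it which hosts to run against and which variables each host has. Here is the example I wrote: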

[k8s_cluster:children]
k8s_ha
k8s_node

[k8s_ha]
ha_IP  node=ha_service

[k8s_node:children]
k8s_control_plane
k8s_computer
k8s_new_computer

[k8s_control_plane:children]
primary_control_plane
replicas_control_plane

[primary_control_plane]
control_plane_01_IP  node=control_plane_01

[replicas_control_plane]
control_plane_02_IP  node=control_plane_02
control_plane_03_IP  node=control_plane_03

[k8s_computer]

computer_01_IP  node=computer_01
computer_02_IP  node=computer_02

[k8s_new_computer]

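I defined one group for the whole cluster (ha + node) and used the same hierarchy to manage the nodes underneath it, which will make future upgrades and management much easier. I also gave every host a variable called node, so Ansible can pass more parameters to each Kubernetes node. Combined with yesterday's Terraform IP output, new nodes can later be added to this inventory automatically and then joined to the cluster.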

Now we can start writing the Ansible YAML that deploys Kubernetes. I split the deployment into 5 parts:
1 Environment setup
2 HA setup
3 Control plane installation
4 Add node
5 CNI setup
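As a rough illustration of how these five parts could map onto the inventory groups, here is a minimal playbook sketch. The file name site.yml and every role name are hypothetical placeholders, not the author's actual layout; only the group names come from the inventory above.

```yaml
# site.yml (hypothetical file name); all role names below are placeholders
- name: 1. Environment setup
  hosts: k8s_cluster
  roles:
    - k8s_environment

- name: 2. HA setup
  hosts: k8s_ha
  roles:
    - haproxy

- name: 3. Install the first control plane
  hosts: primary_control_plane
  roles:
    - control_plane_init

- name: 4. Join the remaining control planes and compute nodes
  # The whole k8s_node group is targeted so that values registered on the
  # primary control plane are visible to the joining hosts in the same play.
  hosts: k8s_node
  roles:
    - join_node

- name: 5. CNI setup
  hosts: primary_control_plane
  roles:
    - cni
```

Running `ansible-playbook site.yml` would then walk through the stages in order, and `--limit k8s_new_computer` could narrow a later run to newly added nodes.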

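Environment setup
I use the following example to install packages, remove swap, disable SELinux, disable the firewall, and load kernel modules with modprobe.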

    - name: install package
      yum:
        name:
          - bash-completion
          - net-tools
          - lsof
          - unzip
          - nc
        state: latest
      when: inventory_hostname in groups['k8s_node']
    - name: Remove swapfile from /etc/fstab
      mount:
        name: swap
        fstype: swap
        state: absent
    - name: set selinux config
      lineinfile:
        path: /etc/selinux/config
        regexp: SELINUX=enforcing
        line: 'SELINUX=disabled'
      when: inventory_hostname in groups['k8s_node']
    - name: disable firewalld
      systemd:
        name: firewalld
        state: stopped
        enabled: no
    - name: Load br_netfilter module
      modprobe:
        name: br_netfilter
        state: present
      register: br_netfilter

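The full example is too long, so to save space I will only summarize the remaining steps. Try to complete them with the Ansible shell module, or by imitating the modules above. (Hint: ansible-doc <module name> shows examples, and ansible-doc -l lists the modules available in your version.)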

echo br_netfilter >> /etc/modules-load.d/br_netfilter.conf
echo "net.bridge.bridge-nf-call-iptables=1" >> /etc/sysctl.conf
echo "vm.swappiness = 0" >> /etc/sysctl.d/k8s.conf
echo "vm.overcommit_memory = 1" >> /etc/sysctl.d/k8s.conf
echo "net.bridge.bridge-nf-call-iptables = 1" >> /etc/sysctl.d/k8s.conf
echo "net.ipv4.ip_forward = 1" >> /etc/sysctl.d/k8s.conf
echo "net.ipv4.tcp_max_syn_backlog=2621440" >> /etc/sysctl.d/k8s.conf
yum install cri-o kubeadm kubelet kubectl (Note: controlling the package repository matters. Pin the versions you use, which means preparing your own repo file; the template module can distribute it consistently.)
Copy harber.crt to every host
Set the stream_tls_cert certificate in /etc/crio/crio.conf
Point seccomp_profile in /etc/crio/crio.conf to /usr/share/containers/seccomp.json
Set pause_image in /etc/crio/crio.conf to your own image repo
Set /etc/sysconfig/kubelet to KUBELET_EXTRA_ARGS=--cgroup-driver=systemd
Start the cri-o and kubelet services
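Some of the steps above could be expressed with dedicated modules instead of shell, for example as in the sketch below. It assumes the sysctl module is available (it ships as ansible.posix.sysctl in newer Ansible releases) and that the cri-o systemd unit is named crio; neither detail is confirmed by the original setup.

```yaml
- name: Load br_netfilter on every boot
  copy:
    dest: /etc/modules-load.d/br_netfilter.conf
    content: "br_netfilter\n"
    mode: "0644"

- name: Write kernel parameters to /etc/sysctl.d/k8s.conf and apply them
  sysctl:
    name: "{{ item.name }}"
    value: "{{ item.value }}"
    sysctl_file: /etc/sysctl.d/k8s.conf
    state: present
    reload: yes
  loop:
    - { name: vm.swappiness, value: "0" }
    - { name: vm.overcommit_memory, value: "1" }
    - { name: net.bridge.bridge-nf-call-iptables, value: "1" }
    - { name: net.ipv4.ip_forward, value: "1" }
    - { name: net.ipv4.tcp_max_syn_backlog, value: "2621440" }

- name: Make kubelet use the systemd cgroup driver
  lineinfile:
    path: /etc/sysconfig/kubelet
    regexp: '^KUBELET_EXTRA_ARGS='
    line: KUBELET_EXTRA_ARGS=--cgroup-driver=systemd
    create: yes

- name: Enable and start cri-o and kubelet
  systemd:
    name: "{{ item }}"   # "crio" is an assumed unit name
    state: started
    enabled: yes
  loop:
    - crio
    - kubelet
```

Compared with appending lines via echo, these modules are idempotent, so running the play again does not duplicate entries.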

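Installing and configuring haproxy
Installation again uses the Ansible yum module. The more interesting part is the Ansible template module. Sometimes you need to copy a file that contains variables, and hand-editing a copy each time is far too much work. The template module takes the source file, fills in its variables according to the template rules, and writes the result to the destination, as in the example below.

This is a section from the middle of the haproxy configuration: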

frontend kubernetes
{% for host in groups['k8s_ha']%}
    bind {{hostvars[host].ansible_facts.default_ipv4.address}}:6443
{% endfor %}
    option tcplog
    mode tcp
    default_backend kubernetes-master-nodes
backend kubernetes-master-nodes
    mode tcp
    balance roundrobin
    option tcp-check
{% for host in groups['k8s_control_plane']%}
    server {{hostvars[host].node}} {{hostvars[host].ansible_facts.default_ipv4.address}}:6443 check fall 3 rise 2
{% endfor %}

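With this template file, Ansible iterates over the hosts in the k8s_control_plane group and writes each node's IP into the haproxy configuration in order.
The result looks like this:

    server control_plane_01 control_plane_01_IP:6443 check fall 3 rise 2
    server control_plane_02 control_plane_02_IP:6443 check fall 3 rise 2
    server control_plane_03 control_plane_03_IP:6443 check fall 3 rise 2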
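A task that renders such a template might look like the sketch below. The template file name haproxy.cfg.j2 is a hypothetical placeholder, and the destination assumes the default haproxy configuration path. Because the template reads ansible_facts from the control plane hosts, facts for those hosts need to have been gathered earlier in the same run.

```yaml
- name: Install haproxy
  yum:
    name: haproxy
    state: present

- name: Render haproxy.cfg from the Jinja2 template
  template:
    src: haproxy.cfg.j2            # hypothetical template file name
    dest: /etc/haproxy/haproxy.cfg
    mode: "0644"
    validate: haproxy -c -f %s     # check the rendered file before installing it
  register: haproxy_cfg

- name: Restart haproxy when the configuration changed
  systemd:
    name: haproxy
    state: restarted
    enabled: yes
  when: haproxy_cfg is changed
```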

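Installing the first control plane
Why is this step separate? Because the first control plane is also the host that later lets other nodes join. I deploy with kubeadm, and to fit it into Ansible I ended up copying a shell script to the control plane host and running it there. Its content is roughly the following:

kubeadm init --service-cidr 10.96.0.0/12 --pod-network-cidr 172.16.0.0/16 --apiserver-advertise-address 0.0.0.0 --control-plane-endpoint "${ha_ip}:6443" --upload-certs --v=5 > /root/kubeadm.log

I will not go into the kubeadm parameters here; see https://kubernetes.io/docs/reference/setup-tools/kubeadm/kubeadm-init/. Why add the extra step of having Ansible run a shell script? In my experiments, running the command directly through Ansible occasionally failed in unstable ways, so I chose the more reliable route, because getting the same result on every run matters most to me. After the installation finishes, the files on the host are pulled back to the control center with the Ansible fetch module so they can be used in the following deployment steps. For example: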

    - name: "Fetching Kubernetes Master PKI files from primary "
      fetch:
        src: /etc/kubernetes/pki/{{item}}
        dest: /tmp/kubernetes/pki/{{item}}
        flat: yes
      with_items:
        - ca.crt
        - ca.key
        - sa.key
        - sa.pub
        - front-proxy-ca.crt
        - front-proxy-ca.key

You will also need the CA files under /etc/kubernetes/pki/etcd/ and admin.conf under /etc/kubernetes/.
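The init step and the extra files mentioned in this section could be handled roughly as follows. This is only a sketch: the script path files/kubeadm_init.sh is hypothetical, and taking ha_ip from the first host in k8s_ha assumes the inventory lists that host by its IP, as in the sample inventory.

```yaml
- name: Run the kubeadm init script on the first control plane
  script: files/kubeadm_init.sh          # hypothetical local script path
  args:
    creates: /etc/kubernetes/admin.conf  # skip when the cluster already exists
  environment:
    ha_ip: "{{ groups['k8s_ha'][0] }}"   # assumes the inventory name is the IP
  when: inventory_hostname in groups['primary_control_plane']

- name: Fetch the etcd CA files and admin.conf from the first control plane
  fetch:
    src: "/etc/kubernetes/{{ item }}"
    dest: "/tmp/kubernetes/{{ item }}"
    flat: yes
  loop:
    - pki/etcd/ca.crt
    - pki/etcd/ca.key
    - admin.conf
  when: inventory_hostname in groups['primary_control_plane']
```

The creates argument keeps the play repeatable: once admin.conf exists, the init script is not run a second time.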

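Joining nodes
I do this in two stages: first join the control planes, then join the compute nodes. Before joining a control plane, the files fetched into /tmp/kubernetes/ must be placed under /etc/kubernetes/ on the control plane host that is about to join. Then run the following two commands on the first control plane to obtain what is needed to join the control plane cluster (try doing this with Ansible's register feature):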

kubeadm token create --print-join-command
kubeadm init phase upload-certs --upload-certs

Combine the output of the two commands with --control-plane --certificate-key. If everything is correct, you end up with a command like this:

kubeadm join IP:6443 --token xxxxxxx   --discovery-token-ca-cert-hash sha256:xxxxxxx  --control-plane --certificate-key yyyyyy

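Running it on the control plane that is joining adds that host to the cluster automatically. A compute node only needs the output of the first command.

Finally, run kubectl get node --kubeconfig ~/.kube/config on the master host and you can start using Kubernetes.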
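One way to try the register approach is sketched below. It assumes the play targets the k8s_node group, so that values registered on the primary control plane can be read through hostvars, and that the last line printed by the upload-certs phase is the certificate key; check both against your own output before relying on them.

```yaml
- name: Create a join command on the first control plane
  command: kubeadm token create --print-join-command
  register: join_command
  when: inventory_hostname in groups['primary_control_plane']

- name: Upload the certificates and capture the certificate key
  command: kubeadm init phase upload-certs --upload-certs
  register: upload_certs
  when: inventory_hostname in groups['primary_control_plane']

- name: Join the remaining control planes
  command: >-
    {{ hostvars[groups['primary_control_plane'][0]].join_command.stdout | trim }}
    --control-plane --certificate-key
    {{ hostvars[groups['primary_control_plane'][0]].upload_certs.stdout_lines[-1] }}
  args:
    creates: /etc/kubernetes/kubelet.conf   # skip hosts that already joined
  when: inventory_hostname in groups['replicas_control_plane']

- name: Join the compute nodes
  command: "{{ hostvars[groups['primary_control_plane'][0]].join_command.stdout | trim }}"
  args:
    creates: /etc/kubernetes/kubelet.conf
  when: inventory_hostname in groups['k8s_computer'] + groups['k8s_new_computer']
```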

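Network configuration
A freshly installed Kubernetes cluster does not provide a CNI for pods, so you need to deploy a service that supplies networking. There are many options, such as flannel, calico, and cilium. I use calico as the example here.

The procedure is simple: download the official YAML and the images it needs, then run kubectl apply -f.
Official YAML:
curl https://docs.projectcalico.org/manifests/calico.yaml -O
Note that after calico is installed, it is a good idea to restart the coredns service so that services can reach one another.
kubectl -n kube-system rollout restart deployment coredns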
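The same steps could be driven from Ansible, for instance as in this sketch. It assumes the tasks run on the first control plane, that /root/calico.yaml is an acceptable download location, and that /etc/kubernetes/admin.conf is used as the kubeconfig; mirroring the images to a private repo is not covered.

```yaml
- name: Download the Calico manifest
  get_url:
    url: https://docs.projectcalico.org/manifests/calico.yaml
    dest: /root/calico.yaml          # assumed download location
    mode: "0644"

- name: Apply the Calico manifest
  command: kubectl apply -f /root/calico.yaml --kubeconfig /etc/kubernetes/admin.conf
  register: calico_apply

- name: Restart CoreDNS after Calico is installed
  command: kubectl -n kube-system rollout restart deployment coredns --kubeconfig /etc/kubernetes/admin.conf
  when: calico_apply.rc == 0
```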

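A single short post cannot cover all of Ansible's concepts and features, and a kubeadm install cannot be fully explained in this little space either. As a compromise, today's post walked through the steps kubeadm needs to install Kubernetes and how to drive them with Ansible. That is also the point I wanted to make from the start: with declarative scripting tools such as Terraform + Ansible, every run produces exactly the same result.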